Abstract

The refinement calculus for logic programs is a framework for deriving logic programs from 
specifications. It is based on a wide-spectrum language that can express both specifications 
and code, and a refinement relation that models the notion of correct implementation. In 
this paper we extend and generalise earlier work on contextual refinement. Contextual 
refinement simplifies the refinement process by abstractly capturing the context of a sub- 
component of a program, which typically includes information about the values of the 
free variables. This paper also extends and generalises module refinement. A module is a 
collection of procedures that operate on a common data type; module refinement between 
a specification module A and an implementation module C allows calls to the procedures 
of A to be systematically replaced with calls to the corresponding procedures of C . Based 
on the conditions for module refinement, we present a method for calculating an imple- 
mentation module from a specification module. Both contextual and module refinement 
within the refinement calculus have been generalised from earlier work and the results are 
presented in a unified framework.

1 Introduction

The construction of programs that are correct with respect to their specifications 
is an important goal of software development. A refinement calculus is a formal 
method for deriving programs from specifications in a stepwise fashion. It is based
on:

• a wide-spectrum language that can express both specifications and executable
programs; 

• a refinement relation that models the notion of correct implementation; and 

• a collection of refinement laws providing the means to refine specifications to 
code in a stepwise fashion. 

The wide-spectrum language contains both specification and implementation constructs,
blurring the distinction between specifications and executable code. A series
of correctness-preserving refinement laws is applied to a specification, replacing
specification constructs with implementation constructs. Each refinement law is
proved with respect to the underlying semantics of the calculus. A law may have
associated proof obligations, which must be discharged to ensure that the application of
the law is valid.

A refinement calculus for logic programs has been developed (Hayes et al. 1997;
Hayes et al. 2000; Hayes et al. 2002). In this paper we extend and generalise earlier
work on contextual and module refinement of logic programs within the refinement 
calculus, and present the results in a unified framework. 

Because our wide-spectrum language is monotonic with respect to the refinement
ordering, a program, S, is refined by refining any of its components. We can use this
property to decompose the refinement of a program into the refinement of (some or 
all of) its components. In many situations a component of S may inherit context 
from S. This context can, for example, provide information about the values of 
free variables in the component. In this paper we provide a framework for making 
context available during the refinement of a program's components. 

We use contextual refinement to reason about module refinement. A module in 
our language is a group of procedures that operate on a common data type. By 
making assumptions about the structure of a program that uses the module we 
derive a context in which efficient implementations of abstract data types are al- 
lowed. Finally, we present a method for deriving, or calculating, an implementation 
module from an abstract module. Starting from the abstract module and a coupling 
invariant — a relation between the abstract and implementation types — a spec- 
ification of the implementation module can be automatically produced (subject to 
some consistency checks). 

The paper is structured as follows. In Sect. 2 the meanings of the wide-spectrum language
constructs and of refinement are described informally. Sect. 3 examines contextual
refinement of logic programs. The contextual refinement laws are illustrated
with an example of a data refinement. In Sect. 4 we discuss module refinement,
where we reason about groups of procedures that operate on a common data type. 
In Sect. [Sjwe present a general scheme for deriving an implementation of a mod- 
ule based on the relationship between the specification and implementation types. 
We then specialise the scheme for particular combinations of abstract operations 
and coupling invariants. In particular, Sect. [Hj extends the specification language so 
that nondeterminism in some coupling invariants can be eliminated, allowing more 
efficient implementation modules. In Sect. [3 we discuss related work. 

This paper summarises and extends the first author's thesis (Colvin 2002). We
combine and extend the results of earlier papers (Colvin et al. 1998; Colvin et al. 2000;
Colvin et al. 2001) and adopt a consistent structure and notation, resulting in a
simpler and more comprehensive theory for contextual and data refinement. Specifically,
the results of Colvin et al. (2000) are generalised by unifying the treatment
of context for the different constructs in the language (Sect. 3), and the results of
Colvin et al. (1998[ are condensed and simplified in the unified notation fSect. lT^ . 



The results of Colvin et al. (2001) are extended by considering more program structures
and allowing arbitrary predicates as context, and a more complex example is
used to present the results (Sect. 2)). We also present a technique for automatically
calculating implementation modules (Sect. 13, originally proposed in Colvin (2002 1. 
Specialisations of the calculation technique fSect. lS^]! and the use of demonic non- 
determinism in module calculations (Sect.|HI) are novel to this paper. 



2 The wide-spectrum language and refinement 

A wide-spectrum language may be used to express both specifications and
executable programs (Partsch 1990). For example, Back (1988) included specification
constructs in Dijkstra's imperative language (Dijkstra 1976). Using a wide-spectrum
language has the benefit of allowing stepwise refinement within a single
notational framework.



2.1 Basic constructs 

Semantic model. For brevity we present an informal, intuitive description of the semantics
of the language and refinement, and present the main theorems and results
as high-level refinement laws. The details of a predicate-based semantics appear
in (Hayes et al. 1997), and of an operational semantics in (Hayes et al. 2002).

In our language, a command (logic program fragment) S with free variables
V constrains (instantiates) V to satisfy S. (This is the same principle as
when a procedure call p(V) constrains V to satisfy p.) The instantiation of the free
variables, which may already be partially or fully instantiated, is the "effect" of S, 
similar to a postcondition in Hoare logic. Additionally, every command may have 
an associated "assumption" , similar to preconditions in Hoare logic. Assumptions 
specify the instantiations of the free variables for which the command is guaran- 
teed to function correctly. If the free variables do not satisfy the assumptions, the 
program may behave in any manner (like abort in Dijkstra's language). 

The commands in our wide-spectrum language are discussed below (a summary
appears in Fig. 1). We describe them in terms of their assumptions (input instantiations)
and effect (output instantiations). Throughout the paper we adopt the
following naming conventions.

A, B, ...    predicates (inside assumption commands)
P, Q, ...    predicates (inside specification commands)
S, T, ...    commands
V, X, Y      variables
U            terms



Specifications. A specification ⟨P⟩ constrains the instantiations of its free variables
so that they satisfy predicate P; it is the basic building block of programs in the
wide-spectrum language. For example, the specification ⟨X = 5 ∨ X = 6⟩ represents
the set of instantiations {5, 6} for X. We define two special cases of specification
commands:

fail ≙ ⟨false⟩
true ≙ ⟨true⟩

The specification fail is not satisfied by any instantiation of free variables; it is
like Prolog's fail. The specification true does nothing, i.e., does not constrain
the instantiations; it is like Prolog's true. Specification commands operate on any
input instantiations, that is, their assumption is always true.

⟨P⟩          specification
{A}          assumption
(S ∨ T)      disjunction
(S ∧ T)      parallel conjunction
(S, T)       sequential conjunction
(∃ V • S)    existential quantification
(∀ V • S)    universal quantification
pc(U)        procedure call

Fig. 1. Summary of commands in the wide-spectrum language

Assumptions. An assumption {A}, where A is a predicate, acts as a precondition,
and thus restricts the input instantiations. As such, it provides a context for a
program fragment. For example, some program S may require that an integer parameter
be non-zero, which can be expressed as "{X ≠ 0}, S". If the assumption
does not hold, the program may abort. Aborting includes program behaviour such
as non-termination and abnormal termination due to exceptions like division by
zero, as well as termination with arbitrary results. We define the (worst possible)
program abort:

abort ≙ {false}

The program abort is thus undefined for any input instantiations. 

Program Operators. The disjunction of two programs (S ∨ T) behaves similarly to
logical disjunction. The output instantiations of a disjunction are the union of the
instantiations of the two programs. There are two forms of conjunction: a parallel
version (S ∧ T), where S and T are executed independently and the intersection
of their instantiations is formed on completion; and a sequential form (S, T), where
S is executed before T, and hence T can rely on the context established by S.

Quantifiers. The existential quantifier (∃ V • S) generalises disjunction, computing
the union of the results of S for all possible values of V. Similarly, the universal
quantifier (∀ V • S) generalises conjunction, computing the intersection of the
results of S for all possible values of V.
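The set-based reading of the operators and quantifiers can be made concrete with a small sketch. It is only an illustration under simplifying assumptions that are not part of the calculus: there is a single free variable X ranging over a small finite domain, assumptions are ignored, and the helper names (spec, disj, par_conj, exists, forall) are hypothetical.

```python
# Illustrative model only: the effect of a command is the set of values of X
# (from a small finite domain) that satisfy it. Assumptions are not modelled.
DOMAIN = range(10)

def spec(pred):
    """Effect of a specification: every value in DOMAIN that satisfies pred."""
    return {x for x in DOMAIN if pred(x)}

def disj(s, t):
    """Disjunction: union of the instantiations of the two programs."""
    return s | t

def par_conj(s, t):
    """Parallel conjunction: intersection of the instantiations."""
    return s & t

def exists(values, body):
    """Existential quantifier: union of body(v) over all values v."""
    result = set()
    for v in values:
        result |= body(v)
    return result

def forall(values, body):
    """Universal quantifier: intersection of body(v) over all values v."""
    result = set(DOMAIN)
    for v in values:
        result &= body(v)
    return result

FAIL = spec(lambda x: False)   # no instantiation satisfies it
TRUE = spec(lambda x: True)    # does not constrain X at all

five_or_six = disj(spec(lambda x: x == 5), spec(lambda x: x == 6))
```

In this model five_or_six is the set {5, 6}, matching the example given for specifications, and FAIL and TRUE are the empty set and the whole domain respectively.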



Procedure call. A procedure call is of the form pc(U), where pc is a procedure and
U is a list of terms.

V :- S                 procedure
re p • V :- C(p) er    recursive procedure
id ≙ proc              procedure definition

Fig. 2. Summary of procedure definitions

2.2 Procedure definitions 

A summary of the syntax associated with procedures is given in Fig. 2.

Procedures. A (non-recursive) procedure is of the form V :- S, where V is a list of
formal parameters and S is the body of the procedure (a command).

Recursive procedure. A recursive procedure has the form re p • V :- C(p) er. Its
body, C(p), encodes zero or more recursive calls to p. To guarantee termination,
the actual parameters of the recursive calls must be less than the formal parameters
(V) according to some well-founded relation.

Procedure definition. A procedure definition is of the form id ≙ proc, where id is
the name of the procedure and proc is a (recursive or non-recursive) procedure.

A distinguishing feature of the refinement calculus when compared to most logic 
program synthesis schemes is the inclusion of assumptions. This allows one to easily 
distinguish between what is assumed by a program and what the program must 
establish. This is useful when defining procedures; often a procedure assumes the
type of some of its parameters, e.g., {X ∈ list(N)}. This assumption may simplify
the refinement, since without it some of the desired properties of the parameter cannot
be used. Alternatively a procedure may be specified to establish the type of one
of its parameters, by giving the type in a specification rather than an assumption,
e.g., ⟨X ∈ list(N)⟩. In logic programming terms, in the case where a type is given
in an assumption the actual parameter to the procedure must be bound to a term 
of that type. The actual parameter must satisfy whatever assumptions are made 
about it, or the procedure may abort. 

Example. We may specify a procedure reverse that relates a list with its reverse. 
We assume list indices start at 1. 

reverse ≙ (L, R) :-
    {list(L)},
    ⟨list(R) ∧ #L = #R ∧
     (∀ i : 1..#L • L(i) = R((#L − i) + 1))⟩

We have defined reverse to be a procedure with formal parameters L and R. 
Within the body of the definition, we assume that L is a list, giving the type of L 
as well as ensuring that L must be bound before a call to reverse. The procedure 
is then required to establish that R is a list of the same size as L, and that the
elements of R are the same as those of L, but in reverse order.

A more concrete implementation of the reverse specification is given by the following
recursive program¹.

Definition 2.1 (Reverse of a list)

reverse ⊑ re rev • (L, R) :-
    ⟨L = [] ∧ R = []⟩ ∨
    (∃ H, T • ⟨L = [H | T]⟩,
        (∃ RT • append(RT, [H], R) ∧ rev(T, RT))) er

We have a recursive block that uses the name rev for recursive calls. The body is 
a disjunction; the first disjunct is the base case where L is empty, and therefore R 
is also empty. The second disjunct is the recursive case, where L is nonempty. We 
reverse the tail of L with the recursive call rev(T, RT), and append the head of L,
H, onto the end of RT (append defines the relationship between three lists where
the third is the concatenation of the first two).
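As a rough cross-check of the two descriptions of reverse, the sketch below gives a functional reading of the recursive program (L treated as input, R as output) and tests it against the index-based specification. Python lists stand in for logic-programming lists, and the names rev and reverse_spec are illustrative choices; this is not the relational semantics of the calculus.

```python
def reverse_spec(l, r):
    """Effect of the abstract specification, with 1-based indices as in the text:
    #L = #R and, for every i in 1..#L, L(i) = R((#L - i) + 1)."""
    n = len(l)
    if len(r) != n:
        return False
    # Subtract 1 from each index to convert to Python's 0-based lists.
    return all(l[i - 1] == r[(n - i) + 1 - 1] for i in range(1, n + 1))

def rev(l):
    """Functional reading of the recursive program: returns the R related to L."""
    if l == []:
        return []            # base case: L = [] and R = []
    h, t = l[0], l[1:]       # recursive case: L = [H | T]
    rt = rev(t)              # reverse the tail
    return rt + [h]          # append(RT, [H], R)
```

For example, reverse_spec([1, 2, 3], rev([1, 2, 3])) holds, while reverse_spec([1, 2, 3], [1, 2, 3]) does not.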

2.3 Refinement 

Program S is refined by program T, written S ⊑ T, if T aborts no more often than S,
and when S does not abort, T produces the same answers as S. Program equivalence
(⊏⊐) is defined as refinement in both directions.

This definition of refinement does not allow the reduction of nondeterminism that 
imperative refinement allows; in logic programming we are interested in all possible 
solutions, and hence any refinement must also return all of those solutions. 
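The definition can be exercised on a toy model. The sketch below assumes that a command is characterised by two predicates over a finite set of states, ok (the states in which it does not abort) and ef (its answers); this pairing and all names in the code are assumptions made for illustration, loosely following the informal description above rather than reproducing the formal semantics.

```python
from collections import namedtuple

# Illustrative finite model: a state is just a value for the single variable X.
STATES = range(6)

# ok: states in which the command does not abort; ef: states it accepts as answers.
Cmd = namedtuple("Cmd", ["ok", "ef"])

def spec(p):
    """Specification: never aborts, answers are the states satisfying p."""
    return Cmd(ok=lambda x: True, ef=p)

def assume(a):
    """Assumption: aborts unless a holds, and adds no constraint of its own."""
    return Cmd(ok=a, ef=lambda x: True)

def refines(s, t, states=STATES):
    """S is refined by T: wherever S does not abort, T does not abort either
    and T gives exactly the same answers as S."""
    return all(t.ok(x) and s.ef(x) == t.ef(x) for x in states if s.ok(x))
```

In this model, weakening an assumption is a refinement (refines(assume(lambda x: x > 1), assume(lambda x: x > 0)) holds), whereas dropping an answer is not: refines(spec(lambda x: x in (1, 2)), spec(lambda x: x == 1)) is false, which reflects the requirement that all solutions be preserved.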

2.4 Refinement laws 

In this section we present some basic refinement laws². Each law represents a refinement
(synthesis/transformation) that may be made. Where a law is divided
into two parts by a horizontal line, the part above the line is the proof obligation
that must be satisfied for the refinement below the line to be applied. For example,
Law 1 (weaken assumption) allows an assumption {A} to be refined to {B}, if A
entails B. This corresponds to reducing the conditions under which the program
can abort. Law 2 (equivalent specifications) allows the characteristic predicate of a
specification to be replaced with an equivalent predicate. This corresponds to maintaining
the set of answers for free variables. These two laws embody the definition
of refinement; they are the main laws we use for manipulating predicates.

Law 1 (Weaken assumption)        Law 2 (Equivalent specifications)

    A ⇛ B                            P ≡ Q
    ─────────                        ───────────
    {A} ⊑ {B}                        ⟨P⟩ ⊏⊐ ⟨Q⟩

¹ A refinement of the abstract reverse definition to the recursive version may be found in
Colvin (2002).

² All refinement laws used in this paper have been proved correct with respect to the semantics
of the language (Colvin 2002).

An entailment A ⇛ B holds if and only if A ⇒ B holds for all possible values of
the free variables in the predicates A and B. The equivalence operator ≡ is defined
as entailment in both directions.

Law 3 (monotonicity of parallel conjunction) is an example of a monotonicity
law. In general, a monotonicity law states that the refinement of a component of a
program refines the entire program. In this case, if S refines to S' and T refines to
T' then the parallel conjunction S ∧ T refines to S' ∧ T'. Monotonicity holds for
all the operators and both quantifiers in the wide-spectrum language.

Law 3 (Monotonicity of parallel conjunction)

    S ⊑ S';  T ⊑ T'
    ─────────────────
    S ∧ T ⊑ S' ∧ T'



3 Contextual refinement 

During refinement we often focus on a component of a program and refine it, re- 
sulting in a refinement of the entire program, i.e., our wide-spectrum language is 
monotonic with respect to refinement. In many situations the larger program can 
provide context that assists in the refinement of a component. This context can be 
used, for instance, to discharge proof obligations. In this section we introduce a gen- 
eral notion of context to the calculus, and demonstrate its use with the refinement 
of a list-reversal procedure. The approach taken is particularly useful when using a
refinement tool, as demonstrated in Hemer et al. (2001). The tool can manage the
context, instead of the user having to explicitly pass the context around in the form
of assumptions.



3.1 Context in refinement laws 

Some laws, such as Law 2 (equivalent specifications), are "stand-alone" laws. The
premise of Law 2, P ≡ Q, requires that P must be equivalent to Q, regardless of the
context in which the specification appears. However, we may wish to reference the
context in order to discharge this proof obligation. To do this, we introduce a
generalised form of Law 2.

    A ⇛ (P ⇔ Q)
    ─────────────────────
    {A}, ⟨P⟩ ⊏⊐ {A}, ⟨Q⟩

This law allows assumptions to be used in the proof that P is equivalent to Q. We
say the specification ⟨P⟩ has A in context. Since we often encounter laws where a
refinement occurs with respect to some context we introduce an abbreviation.

    A ⊩ S ⊑ S'  ≙  {A}, S ⊑ {A}, S'

This is similar to the notation used by Nickson and Hayes (1997) for contextual
refinement of imperative programs. Thus the generalised form of Law 2 is written
as

Law 4 (Equivalent specifications w.r.t. context)

    A ⇛ (P ⇔ Q)
    ───────────────
    A ⊩ ⟨P⟩ ⊏⊐ ⟨Q⟩

The following law is similar to Law 3 (monotonicity of parallel conjunction),
except that context for a parallel conjunction is inherited by both conjuncts.

Law 5 (Monotonicity of parallel conjunction)

    A ⊩ (S ⊑ S');  A ⊩ (T ⊑ T')
    ────────────────────────────
    A ⊩ (S ∧ T ⊑ S' ∧ T')

To refine S ∧ T in context A, we may refine either of the conjuncts S or T
using A as the context. There are similar contextual monotonicity laws for the 
other constructs in our language. Such laws allow the context to be passed around 
in a straightforward manner, and for this reason we do not explicitly mention the 
application of such laws in refinements. 

For a sequential conjunction (S, T), command S is executed before T, and hence
S establishes a context for T. For example, in the program ⟨X = 1⟩, ⟨Y = X + 1⟩,
the first component establishes X = 1, and this may be assumed when refining the
second component, e.g., the second component may be refined to ⟨Y = 2⟩. Law 6
gives the general rule when the first component is an assumption, and Law 7 when
it is a specification.

Law 6 (Assumption in context)        Law 7 (Specification in context)

    A ∧ B ⊩ (T ⊑ T')                     A ∧ P ⊩ (T ⊑ T')
    ─────────────────────                ─────────────────────
    A ⊩ {B}, T ⊑ {B}, T'                 A ⊩ ⟨P⟩, T ⊑ ⟨P⟩, T'

Using Law 6 we may refine T with B in context in addition to A, and similarly
with P in Law 7. This information may be used to discharge proof obligations in
the refinement of T to T'.
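The example with X and Y can be checked mechanically on a small finite model. The sketch below makes several assumptions that are not taken from the paper: a state is a pair (x, y) over a small range, a command is a pair of predicates ok and ef, and sequential conjunction is modelled so that the second command only needs to be defined where the first succeeds. All function names are hypothetical.

```python
from collections import namedtuple
from itertools import product

STATES = list(product(range(4), repeat=2))   # each state is a pair (x, y)

Cmd = namedtuple("Cmd", ["ok", "ef"])

def spec(p):
    return Cmd(ok=lambda st: True, ef=p)

def assume(a):
    return Cmd(ok=a, ef=lambda st: True)

def seq(s, t):
    """Sequential conjunction (S, T). Assumed model: T must be defined only
    where S succeeds; the answers are those satisfying both effects."""
    return Cmd(ok=lambda st: s.ok(st) and (not s.ef(st) or t.ok(st)),
               ef=lambda st: s.ef(st) and t.ef(st))

def refines(s, t):
    return all(t.ok(st) and s.ef(st) == t.ef(st) for st in STATES if s.ok(st))

def ctx_refines(a, s, t):
    """Refinement of S by T in context A, unfolded as {A}, S refined by {A}, T."""
    return refines(seq(assume(a), s), seq(assume(a), t))

x_is_1 = lambda st: st[0] == 1
y_is_x_plus_1 = spec(lambda st: st[1] == st[0] + 1)
y_is_2 = spec(lambda st: st[1] == 2)
```

Here ctx_refines(x_is_1, y_is_x_plus_1, y_is_2) holds although refines(y_is_x_plus_1, y_is_2) does not, which is the situation the contextual laws are designed for: the rewrite is only justified once X = 1 is available as context.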



3.2 Contextual data refinement 

In this section we use contextual refinement to demonstrate data refinement, where 
a variable of an abstract type is replaced with one or more variables of a concrete 
type. Data refinement may be used to replace a specification type with an imple- 
mentation type, or to improve the efficiency of a program. The abstract and concrete 
types are related by a coupling invariant, which is used to provide context for the 
data refinement. As an example, we show part of the refinement of the simple implementation
of reverse (Definition 2.1) on lists to a more efficient implementation
using difference lists (sometimes referred to as an accumulator implementation). In
Sect. 3.2.1 we data refine reverse assuming that the coupling invariant holds in context;
in Sect. 3.2.2 we complete the data refinement by showing how the coupling
invariant context can be established efficiently and transparently.

3.2.1 Coupling invariant in context

We refine a procedure call reverse(L, R) in a context in which the list R is represented
by the difference list (DL1, DL2), i.e.,

    R ⌢ DL2 = DL1        (3.1)

The operator ⌢ represents list concatenation, thus R is a prefix of DL1 and DL2
is a suffix of DL1. When this relationship holds, R = DL1 − DL2 (interpreting '−'
as list difference).

We begin the refinement of reverse(L, R), with the coupling invariant as an assumption
(the context for the refinement).

    {R ⌢ DL2 = DL1}, reverse(L, R)

We expand the call reverse(L, R) from Definition 2.1.

    {R ⌢ DL2 = DL1},
    ⟨L = [] ∧ R = []⟩ ∨
    (∃ H, T • ⟨L = [H | T]⟩,
        (∃ RT • append(RT, [H], R) ∧ rev(T, RT)))

Because program disjunction is monotonic with respect to refinement and the
context of the disjunction is inherited by its disjuncts, we may refine the first
disjunct, ⟨L = [] ∧ R = []⟩, with the coupling invariant in context. Using Law 4
(equivalent specifications) we rewrite R = [] to DL1 = DL2, since the context
implies they are equivalent expressions: given R ⌢ DL2 = DL1, the list R is empty
exactly when DL1 and DL2 are equal.

    ⟨L = [] ∧ DL1 = DL2⟩

The details of the refinement of the second disjunct are more complex, requiring 
the introduction of a recursive call. We omit the details for brevity, though the full 
refinement can be found in Colvin (2002). The resulting recursive program is the
usual difference list implementation of reverse.

reversedl ≙ re revdl • (L, DL1, DL2) :-
    ⟨L = [] ∧ DL1 = DL2⟩ ∨
    (∃ H, T • ⟨L = [H | T]⟩, revdl(T, DL1, [H | DL2])) er

The refinement can be summarised by the following relation:

    R ⌢ DL2 = DL1 ⊩ reverse(L, R) ⊑ reversedl(L, DL1, DL2)        (3.2)
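Relation (3.2) can be sanity-checked by brute force on small lists. The sketch below reads both procedures as Boolean tests on fully instantiated arguments, with Python lists standing in for logic-programming lists; the function names and the choice of a two-element alphabet are illustrative assumptions, and the check is no substitute for the formal refinement.

```python
from itertools import product

def reverse_rel(l, r):
    """The reverse relation on ground lists: R is L reversed."""
    return r == l[::-1]

def revdl(l, dl1, dl2):
    """Difference-list version, read as a test on ground arguments."""
    if not l:
        return dl1 == dl2                # base case: L = [] and DL1 = DL2
    h, t = l[0], l[1:]                   # recursive case: L = [H | T]
    return revdl(t, dl1, [h] + dl2)      # push H onto the accumulator DL2

def small_lists(alphabet=(0, 1), max_len=3):
    """All lists over the alphabet of length at most max_len."""
    for n in range(max_len + 1):
        for items in product(alphabet, repeat=n):
            yield list(items)

def check_3_2():
    """Whenever R ++ DL2 = DL1 holds, both procedures give the same verdict."""
    for l in small_lists():
        for r in small_lists():
            for dl2 in small_lists():
                dl1 = r + dl2            # establish the coupling invariant (3.1)
                if reverse_rel(l, r) != revdl(l, dl1, dl2):
                    return False
    return True
```

Choosing dl2 to be the empty list in this check corresponds to the special case in which the difference list (DL1, []) represents DL1 itself.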



3.2.2 Data refinement by establishing context 

In the previous section a context was given that allows calls to reverse to be re- 
placed with calls to the more efficient procedure reversedl. However establishing 
this context in arbitrarily large and complex programs may not be feasible. In this 
section we show how the problem can be avoided by implementing reverse in terms 
of reversedl.

We start by choosing a stronger coupling invariant than (3.1), in which DL1 is
equal to R and DL2 is the empty list.

    R = DL1 ∧ DL2 = []        (3.3)

Hence we may deduce reverse(L, R) ⊑ reversedl(L, R, []) because (3.3) implies the
premise of (3.2). This is a valid refinement in any context. Of course, in a program
that makes many calls to reverse, we may hide this change by implementing the
body of reverse as just a call to reversedl(L, R, []). The (new) body of reverse
provides the context of (3.3) locally, avoiding the need for the calling program to
establish the context.

The above refinements are examples of data refinement on procedures. In the 
next section we consider data refinement on groups of procedures that operate on 
a common data type. 

4 Modular logic program refinement 

In this section we introduce the notion of a module, which is a group of procedures 
that operate on a common data type. By making some assumptions about the 
context in which an abstract module may be used, we may allow a more efficient 
module to be used in its place. 

4.1 Module specifications

As with modules in logic programming languages such as Mercury (Somogyi et al. 1995)
and Gödel (Hill and Lloyd 1994), modules in the wide-spectrum language are collections
of procedures that operate on a common data type. The data type is intended
to be opaque, that is, the implementation of the type is hidden, and variables of 
that type may only be manipulated via the procedures of the module. 

We split the opaque parameters of a module procedure into two categories, in- 
put and output, which correspond with the logic programming modes "ground" 
and "var" (unbound), respectively. Upon a procedure call, opaque inputs must 
already have been instantiated to the module type and opaque outputs must be 
uninstantiated. In addition, procedures may have a set of regular, i.e., non-opaque, 
parameters. 

Fig. 3 defines a module Partial Function that declares operations on a type
pfun. The type pfun is a partial function from elements of its domain type σ to
elements of its range type τ, written σ ⇸ τ. A function may be modelled as a set
of pairs. A partial function is a function that may be undefined for some elements
of its domain, as distinct from a total function, which maps every element of its
domain to some value. We have left the actual types for σ and τ unspecified since
none of the operations depend on these types (though later we will assume that
a hash function exists for σ); we can therefore consider Partial Function to be
polymorphic. Within the module the type signature of each procedure is declared.
Opaque inputs have an assumption about their type and the specification of each
procedure guarantees that the opaque outputs are instantiated to be of the opaque
type. Opaque inputs and outputs are subscripted with i and o, respectively. The
parameters of type σ and τ (K and V) are regular parameters.

Module Partial Function
  Type pfun ≙ σ ⇸ τ

  init:    F': pfunₒ
  update:  K: σ, V: τ, F: pfunᵢ, F': pfunₒ
  access:  K: σ, F: pfunᵢ, V: τ
  remove:  K: σ, F: pfunᵢ, F': pfunₒ

  init   ≙ F' :- ⟨F' = {}⟩
  update ≙ (K, V, F, F') :- {F ∈ pfun ∧ K ∈ σ ∧ V ∈ τ}, ⟨F' = F ⊕ {(K, V)}⟩
  access ≙ (K, F, V) :- {F ∈ pfun ∧ K ∈ σ}, ⟨K ∈ dom(F) ∧ V = F(K)⟩
  remove ≙ (K, F, F') :- {F ∈ pfun ∧ K ∈ σ}, ⟨F' = {K} ⩤ F⟩
End

Fig. 3. Abstract partial function module

In the definition of update, the symbol ⊕ stands for function override; the function
f ⊕ g is the same as function f, except with all elements in the domain of
g mapped according to g. Therefore, F ⊕ {(K, V)} is the same as F but with K
mapped to V instead of F(K). In the definition of remove we use domain subtraction:
the function {K} ⩤ F is the same as F, except K is no longer in the
domain.



Following the data type terminology of Liskov and Guttag (1986), a procedure
with no opaque inputs is referred to as an initialisation procedure; for example,
init is an initialisation procedure which instantiates the opaque output F' to the
empty function (represented by the empty set of pairs). A procedure with no opaque
outputs is referred to as an observer; for example, access is an observer that fails
if the regular parameter K is not in the domain of the opaque input function F,
and instantiates the regular parameter V to F(K) otherwise. A procedure with
both opaque inputs and outputs is called a constructor; for example, the procedure
update has an opaque output F', which is the opaque input F updated by the
pair (K, V). A constructor can be likened to updating the state in an imperative
module. Note that init, update, and remove all guarantee that their opaque output
is an element of pfun.
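To make the roles of initialisation procedure, constructor and observer concrete, the following sketch models the four operations as pure Python functions, with a dictionary standing in for the abstract partial function. It is an executable illustration under that assumption, not the module itself: opaque outputs become return values, the type assumptions are not checked, and failure of access is modelled by an empty list of answers.

```python
def init():
    """Initialisation procedure: the empty function."""
    return {}

def update(k, v, f):
    """Constructor: F overridden with the pair (K, V). The input is not mutated,
    mirroring the fact that F and F' are distinct logic variables."""
    f2 = dict(f)
    f2[k] = v
    return f2

def access(k, f):
    """Observer: the list of answers for V. It is empty (failure) when K is not
    in the domain of F, and the single value F(K) otherwise."""
    return [f[k]] if k in f else []

def remove(k, f):
    """Constructor: F with K subtracted from its domain."""
    return {key: val for key, val in f.items() if key != k}
```

Returning new dictionaries rather than updating in place keeps earlier versions of the function available, which matters when two versions derived from the same input are used side by side.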



4.2 Using modules

Our intuition is that a module is to be used opaquely in the construction and maintenance
of some data structure throughout multiple procedure calls. We therefore
consider programs whose procedure calls are ordered so that the intended modes
of the opaque inputs and outputs are satisfied, and the variables used as opaque
inputs and outputs are local to the program. For instance, consider the following
program that uses the module Partial Function. It inserts the pairs (a, 2) and (b, 1)
into a function and accesses the value for a.

    (∃ F • init(F), (∃ F' • update(a, 2, F, F'),
        (∃ F'' • update(b, 1, F', F''), access(a, F'', X))))        (4.1)

The use of sequential conjunction reflects the notion of the changing state and also
allows the assumptions of the later calls to be satisfied. Initially, F is instantiated
to the empty function. The two calls to update update F to F' and then to F''.
Overall, the only variable we are interested in is X; the opaque parameters are
local because they are existentially quantified when they are used as an output.
By only dealing with programs of this form, we can use contextual information to 
derive more efficient implementations of the module. 

To formalise this notion, we say a program is in output-quantified form with
respect to a module M if, for all procedure calls p(V, I, O) where p is in M and V
stands for the regular parameters, the opaque inputs I are bound and the opaque
outputs O are not bound before the call. Also, the opaque variables must not be
used except by procedures in M. We first define open output-quantified form, which
is a generalisation of output-quantified form.

Definition 4.2 (Open output-quantified form)

We say a program is in open output-quantified form w.r.t. a module M and a set
of free opaque variables IV if it is in one of the following forms:

1. a program fragment that does not rely on the opaque variables in IV nor
make calls on any of the procedures in M;

2. a program of one of the following forms,

    C1 ∨ C2
    C1 ∧ C2
    C1, C2
    (∃ V • C1)
    (∀ V • C1)

where C1 and C2 are subcomponents that are in open output-quantified form
w.r.t. M and IV, and V is a regular (non-opaque) variable; or,

3. a program of the form

    (∃ O • p(V, I, O), C)

where p is a procedure in M and V, I, and O are the regular, opaque input,
and opaque output parameters, respectively, of p. The opaque inputs I must
be a subset of IV. The component C must be in open output-quantified form
w.r.t. M and the set IV ∪ {O}. When p has no outputs, i.e., it is an observer,
there are no quantified variables and the corresponding form is just p(V, I), C.

Definition 4.3 (Output-quantified form)

We say a program is in output-quantified form w.r.t. a module M if it is in open
output-quantified form w.r.t. M and contains no free opaque variables.
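The inductive definition above translates fairly directly into a recursive check over a program's syntax tree. The sketch below is one possible encoding: the tree classes, their field names and the decision to omit regular parameters from module calls are all assumptions made for illustration, and the check covers only the structural conditions stated in the three forms.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union

@dataclass(frozen=True)
class Other:
    """Form 1: a fragment making no module calls; uses = variables it mentions."""
    uses: FrozenSet[str] = frozenset()

@dataclass(frozen=True)
class Binary:
    """Form 2: disjunction, parallel or sequential conjunction of two components."""
    left: "Prog"
    right: "Prog"

@dataclass(frozen=True)
class Quant:
    """Form 2: existential or universal quantification over a regular variable."""
    var: str
    body: "Prog"

@dataclass(frozen=True)
class ModCall:
    """Form 3: a module call followed by the rest of the program. Regular
    parameters are omitted; outputs is empty for an observer."""
    proc: str
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]
    rest: "Prog"

Prog = Union[Other, Binary, Quant, ModCall]

def is_open_oqf(prog, iv):
    """Open output-quantified form w.r.t. the free opaque variables iv."""
    if isinstance(prog, Other):
        return not (prog.uses & iv)
    if isinstance(prog, Binary):
        return is_open_oqf(prog.left, iv) and is_open_oqf(prog.right, iv)
    if isinstance(prog, Quant):
        return prog.var not in iv and is_open_oqf(prog.body, iv)
    if isinstance(prog, ModCall):
        inputs_ok = set(prog.inputs) <= iv
        return inputs_ok and is_open_oqf(prog.rest, iv | set(prog.outputs))
    return False

def is_oqf(prog):
    """Output-quantified form: open form with no free opaque variables."""
    return is_open_oqf(prog, frozenset())

# The init / update / update / access program, with opaque variables F, F1, F2.
example = ModCall("init", (), ("F",),
          ModCall("update", ("F",), ("F1",),
          ModCall("update", ("F1",), ("F2",),
          ModCall("access", ("F2",), (), Other()))))
```

With this encoding is_oqf(example) holds, while a program that calls access on a variable no module procedure has produced, or a non-module fragment that inspects an opaque variable in scope, is rejected.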
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Because logic programs do not typically have "state" , we must pass the opaque 
parameters explicitly, and hence in some sense the implementation details are ex- 
posed. However, programs that are in output-quantified form are restricted to only 
using the opaque type and variables via the procedures of the module. This ensures 
the module is used as intended, i.e., with the type being opaque. 

Since the type is opaque, a program in output-quantified form is amenable to
syntactic simplification that hides the opaque variables. The opaque variables are
locally quantified, and typically appear as input/output pairs, so we can adopt a
shorthand similar to that of definite clause grammars (DCGs) in Prolog (and other
logic programming languages). For instance, we could write program (4.1) thus:

    init, update(a, 2), update(b, 1), access(a, X)                       (4.4)

At each call to a procedure from the module, from form 3 in Definition 4.2 we
can immediately identify that a new output opaque variable must be quantified
(except in the case of the observer, access), and fill in the input/output
parameters of the procedure call appropriately (resulting in program (4.1)).
However, syntactic simplifications like this restrict expressiveness. By hiding the
state we have no easy way of having two instances of the state active at one time
(imperative languages without opaque types also have this problem). For example,
the shorthand notation cannot be used to simplify the following program, which has
two different partial functions G and H, containing (b, 1) and (b, 2), respectively.

    (∃ F • init(F), (∃ G • update(b, 1, F, G), (∃ H • update(b, 2, F, H), ...)))

For this reason we use the more general notation in which opaque variables are
explicit.
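The expansion from the shorthand to explicit opaque variables can be sketched in Python. This is only an illustration. The parameter layouts in SIGNATURES, the fresh variable names F0, F1, ... and the function name thread_opaque are assumptions introduced here, not part of the calculus.

```python
# Hypothetical parameter layouts for the Partial Function module: integers index
# the regular arguments, "IN"/"OUT" mark the opaque input and output positions.
SIGNATURES = {
    "init":   ["OUT"],
    "update": [0, 1, "IN", "OUT"],
    "remove": [0, "IN", "OUT"],
    "access": [0, "IN", 1],          # observer: no opaque output
}

def thread_opaque(calls):
    """Expand shorthand calls into calls with explicit opaque variables.

    Assumes the sequence starts with a call that has no opaque input (init).
    """
    quantified, expanded = [], []
    current = None                   # most recently produced opaque variable
    for name, args in calls:
        layout = SIGNATURES[name]
        previous = current
        if "OUT" in layout:          # a new output variable must be quantified
            current = f"F{len(quantified)}"
            quantified.append(current)
        slots = {"IN": previous, "OUT": current}
        params = [slots[p] if isinstance(p, str) else args[p] for p in layout]
        expanded.append(f"{name}({', '.join(params)})")
    return quantified, expanded

shorthand = [("init", []), ("update", ["a", "2"]),
             ("update", ["b", "1"]), ("access", ["a", "X"])]
quantified, expanded = thread_opaque(shorthand)
print("exists", ", ".join(quantified), ":", ", ".join(expanded))
```

Because each call consumes the most recent opaque variable, this scheme cannot express the two-state program above, in which both G and H are derived from F.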

4.3 Module refinement

In general, we say a module M is refined by a module M+ if, for all possible
programs S using calls to M, S is refined by the program obtained by replacing
all calls to the procedures of M by calls to the corresponding procedures of M+. In
this section we consider a law for module refinement (Theorem 4.5) that can be used
only if the programs using the module are in output-quantified form (Definition 4.3).

Consider the Partial Function module defined in Fig. 2. A program that uses it,
e.g., (4.1), is not directly implementable, since the module uses the abstract partial
function type, which is not part of the implementation language. We would like to
replace the calls to init, update, remove, and access from the Partial Function module
with corresponding calls on a module that implements the operations on an
implementation data type. Of course, replacing the references to the Partial Function
module with references to the implementation module must result in a refinement
of the program in question. The following is our theorem for module refinement.
As with the data refinement example in Sect. 3.2, we require a coupling invariant
(CI) to relate the abstract and concrete types.

Theorem 4.5 (Module Refinement)

Assume the following: modules M and M+, with associated opaque types Σ and
Σ+, respectively; a coupling invariant CI that relates the types Σ and Σ+; and all
corresponding pairs of procedures p and p+ from M and M+, respectively, satisfy
Condition 4.6 below, using CI. Then a program C, which is in output-quantified
form w.r.t. M, is refined by the program C+, which is structurally the same as C
except with procedure calls to module M replaced by the corresponding procedure
calls to module M+.

Proof. The theorem is proved by structural induction over programs in open
output-quantified form. A detailed proof can be found in (Colvin 2002); it is a
generalised version of the proof in (Colvin et al. 2001). □

Consider the abstract and concrete procedures p and p+, which are defined as
follows.

    p  ≙ (V, I, O) :- {A}, ⟨P⟩

    p+ ≙ (V, I+, O+) :- {A+}, ⟨P+⟩

The variables in I and O are of the abstract opaque type Σ, and similarly the
variables in I+ and O+ are of the concrete opaque type Σ+. The regular variables,
V, may be of any other type. The free variables of the assumption {A} are restricted
to V and I, and the free variables of the specification ⟨P⟩ are restricted to V, I
and O. Corresponding restrictions apply to {A+} and ⟨P+⟩. The following predicate
describes the conditions that must hold between procedures p and p+ with respect
to the coupling invariant CI.

Condition 4.6

    CI(I, I+) ∧ A ⇒                                      (4.7)
        A+ ∧                                             (4.8)
        (P ⇒ (∃ O+ • P+ ∧ CI(O, O+))) ∧                  (4.9)
        (P+ ⇒ (∃ O • P ∧ CI(O, O+)))                     (4.10)

This condition states that, assuming the inputs are related by the coupling
invariant and the abstract assumption A holds (4.7): the concrete assumption holds
(4.8); every abstract answer has a corresponding answer in the concrete
implementation (4.9); and every concrete answer is related to an answer in the
abstract procedure (4.10).
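Condition 4.6 can be checked by brute force when the candidate values are restricted to small finite collections. The sketch below does this for a toy instance borrowed from the set-as-list example of Sect. 6.1 (abstract sets over {0, 1}, concrete lists of bounded length). The function name satisfies_condition, the bounds on list length and the deliberately faulty procedure skip are assumptions made for illustration. A bounded check is no substitute for a proof.

```python
from itertools import product

def satisfies_condition(CI, A, P, A_plus, P_plus, regs, abs_vals, conc_in, conc_out):
    """Brute-force check of Condition 4.6 over finite candidate values."""
    for V, I, Ic in product(regs, abs_vals, conc_in):
        if not (CI(I, Ic) and A(V, I)):                      # antecedent (4.7)
            continue
        if not A_plus(V, Ic):                                # (4.8)
            return False
        for O in abs_vals:                                   # (4.9)
            if P(V, I, O) and not any(P_plus(V, Ic, Oc) and CI(O, Oc) for Oc in conc_out):
                return False
        for Oc in conc_out:                                  # (4.10)
            if P_plus(V, Ic, Oc) and not any(P(V, I, O) and CI(O, Oc) for O in abs_vals):
                return False
    return True

ELEMS = (0, 1)
SETS = [frozenset(s) for s in ([], [0], [1], [0, 1])]

def lists(max_len):
    """All tuples over ELEMS of length at most max_len (stand-ins for lists)."""
    return [t for n in range(max_len + 1) for t in product(ELEMS, repeat=n)]

CI = lambda S, L: S == frozenset(L)          # coupling invariant S = ran(L)
true = lambda *args: True                    # trivial assumptions
add = lambda E, S, S2: S2 == S | {E}         # abstract procedure
cons = lambda E, L, L2: L2 == (E,) + L       # candidate concrete procedure
skip = lambda E, L, L2: L2 == L              # faulty candidate: forgets E

ok = satisfies_condition(CI, true, add, true, cons, ELEMS, SETS, lists(2), lists(3))
bad = satisfies_condition(CI, true, add, true, skip, ELEMS, SETS, lists(2), lists(3))
print(ok, bad)
```

The faulty candidate is rejected by (4.9): the abstract answer S ∪ {E} has no related concrete answer when E is not already in the list.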



4.4 Example

Concrete type. For our implementation of the Partial Function module, we assume
the existence of an injection*, hash, that uniquely maps elements of type σ to a
natural number in the range 0..N - 1. With this assumption, we may implement
a partial function as an array, the indices of which are the hashed values of σ. In
other words, the array acts as a hash table. We define the type hashtable as an array
of size N, the elements of which are either of the range type τ or the special element
null (not an element of τ).

    hashtable = (0..N - 1) → (τ ∪ {null})

The symbol '→' indicates a total function, which in this case models an array.

* By requiring hash to be an injection we are assuming that no two keys will map to
the same natural number, and hence avoid the problem of clashes. A more general
approach that handles clashes is possible, but would complicate the presentation.

Coupling invariant. Now that we have defined the concrete type, we give a coupling
invariant that relates a partial function F to a hash table H:

    H = makehash(F)                                      (4.11)

where makehash(F) = {i: 0..N - 1 • (i, null)} ⊕ {(K, V): F • (hash(K), V)}. We
have written makehash as a set comprehension. In general, a set comprehension
{x: T • e(x)} represents the set of values of the expression e(x) for each element x
of type T. For example, {i: 0..N - 1 • (i, null)} is the set of pairs (i, null) for each
number i in the range 0..N - 1. Thus makehash(F) is a mapping from hash(K) to
V for all pairs (K, V) appearing in the function F, with all other numbers mapping
to null. We assume we have available a module that implements operations such as
updates and accesses on arrays in constant time, e.g., the array module in Mercury
(Somogyi et al. 1995). We note the following property:

    F ∈ pfun ∧ H = makehash(F) ⇒ H ∈ hashtable           (4.12)

Condition 4.6 for procedure update. As an example instantiation of Condition 4.6
we prove that, given the coupling invariant (4.11), the following procedure is a valid
array implementation of update (from Fig. 2).

    update+ ≙ (K, V, H, H') :-
        {H ∈ hashtable ∧ K ∈ σ ∧ V ∈ τ}, ⟨H' = H ⊕ {(hash(K), V)}⟩

This can be implemented efficiently in Mercury by using the set predicate from
the array module.

First we show that (4.7) entails (4.8).

    H = makehash(F) ∧ F ∈ pfun ∧ K ∈ σ ∧ V ∈ τ ⇒
        H ∈ hashtable ∧ K ∈ σ ∧ V ∈ τ

The conditions on K and V hold trivially, and H ∈ hashtable follows from (4.12).
We would normally expect (4.8) to be shown this easily.
Now we show that the rest of Condition 4.6 holds.

    H = makehash(F) ∧ F ∈ pfun ∧ K ∈ σ ∧ V ∈ τ ⇒
        (F' = F ⊕ {(K, V)} ⇒
            (∃ H' • H' = H ⊕ {(hash(K), V)} ∧ H' = makehash(F'))) ∧
        (H' = H ⊕ {(hash(K), V)} ⇒
            (∃ F' • F' = F ⊕ {(K, V)} ∧ H' = makehash(F')))

Simplifying using the one-point rule we get:

    H = makehash(F) ∧ F ∈ pfun ∧ K ∈ σ ∧ V ∈ τ ⇒
        (F' = F ⊕ {(K, V)} ⇒ H ⊕ {(hash(K), V)} = makehash(F')) ∧
        (H' = H ⊕ {(hash(K), V)} ⇒ H' = makehash(F ⊕ {(K, V)}))

Now we simplify the implications, combining them into a single stronger
predicate.

    H = makehash(F) ∧ F ∈ pfun ∧ K ∈ σ ∧ V ∈ τ ⇒
        H ⊕ {(hash(K), V)} = makehash(F ⊕ {(K, V)})

Thus we must show that, given that the coupling invariant for the inputs F and H 
holds and that F and the regular parameters are of the correct type, the coupling 
invariant holds for the output values. We prove the conclusion by manipulating the 
expression H ⊕ {(hash(K), V)}.

      H ⊕ {(hash(K), V)}
    =   from antecedent H = makehash(F); definition of makehash
      {i: 0..N - 1 • (i, null)} ⊕ {(X, Y): F • (hash(X), Y)} ⊕ {(hash(K), V)}
    =   since F is a function and hash is an injection
      {i: 0..N - 1 • (i, null)} ⊕ {(X, Y): F ⊕ {(K, V)} • (hash(X), Y)}
    =   definition of makehash
      makehash(F ⊕ {(K, V)})

Thus we have proved Condition 4.6 for update. The full array implementation of
the Partial Function module is shown later (Fig. 5); the remaining procedures are
derived using techniques described in Sect. 5.
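The equality at the heart of this proof, H ⊕ {(hash(K), V)} = makehash(F ⊕ {(K, V)}) whenever H = makehash(F), can be exercised exhaustively on a small instance. In the Python sketch below, the table size N, the three keys with their hash values, and the value set are assumptions chosen only to keep the enumeration small. Python dicts stand in for partial functions and lists for arrays.

```python
from itertools import product

N = 4
HASH = {"a": 0, "b": 2, "c": 3}      # hypothetical injection into 0..N-1
VALUES = (1, 2)
NULL = None                           # stands in for the special element null

def makehash(F):
    """Coupling invariant (4.11): the hash table representing partial function F."""
    H = [NULL] * N
    for key, value in F.items():
        H[HASH[key]] = value
    return H

def override(H, index, value):
    """H ⊕ {(index, value)} on arrays, without mutating H."""
    return H[:index] + [value] + H[index + 1:]

def partial_functions():
    """Every partial function from the keys of HASH to VALUES, as a dict."""
    for image in product((NULL,) + VALUES, repeat=len(HASH)):
        yield {k: v for k, v in zip(HASH, image) if v is not NULL}

# Concrete update applied to makehash(F) versus makehash of the abstract update.
lemma_holds = all(
    override(makehash(F), HASH[K], V) == makehash({**F, K: V})
    for F in partial_functions() for K in HASH for V in VALUES
)
print(lemma_holds)
```

If HASH were not injective, two keys could share an index and the equality would no longer hold in general, which is where the proof uses the injection assumption.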

5 Calculating a concrete module 

In the previous section we described the conditions that must hold between two 
modules with respect to a coupling invariant to allow module refinement. In this 
section we show how those conditions may be used to calculate a concrete module, 
given an abstract module and an appropriate coupling invariant. The procedures 
of the calculated module are guaranteed to satisfy Condition 4.6 with respect to
their corresponding abstract procedures. After introducing the general form of a 
calculated concrete procedure, we specialise the technique based on the determinism 
of the coupling invariant and the abstract procedure. 

5.1 General form of concrete procedure

The following theorem gives the general form for the concrete procedure given the 
abstract procedure and the coupling invariant. 

Theorem 5.1 (Module calculation)

Given a procedure p ≙ (V, I, O) :- {A}, ⟨P⟩ with at most V and I free in A, and at
most V, I and O free in P, if p and the coupling invariant CI satisfy the following
properties for some predicate R which is independent of I,

    CI(I, I+) ∧ A ∧ P ⇒ (∃ O+ • CI(O, O+))                           (5.2)
    CI(I, I+) ∧ A ⇒
        ((∃ O • P ∧ CI(O, O+)) ⇔ R(V, I+, O+))                       (5.3)

then the following implementation of the concrete procedure satisfies Condition 4.6.

    p+ ≙ (V, I+, O+) :-
        {(∃ I • CI(I, I+) ∧ A)},                                     (5.4)
        ⟨(∀ I • CI(I, I+) ∧ A ⇒ (∃ O • P ∧ CI(O, O+)))⟩

With this theorem we may immediately derive a concrete module from an abstract
module that will satisfy Condition 4.6, provided the coupling invariant satisfies (5.2)
and (5.3).

Proof.* To prove that (5.4) satisfies Condition 4.6 we prove that it satisfies (4.8),
(4.9) and (4.10), assuming (5.2), (5.3), (4.7), and that the outputs O and O+ do
not occur free in the assumptions A and A+, respectively. We write E[X/I] for E
with X substituted for I.

(4.8) Substitute (∃ I • CI(I, I+) ∧ A) for A+ in (4.8); then (4.8) holds from (4.7).

(4.10) Substituting (∀ I • CI(I, I+) ∧ A ⇒ (∃ O • P ∧ CI(O, O+))) for P+ in
(4.10), with bound variable I renamed to X, gives the following.

    (∀ X • CI(X, I+) ∧ A[X/I] ⇒ (∃ O • P[X/I] ∧ CI(O, O+))) ⇒
        (∃ O • P ∧ CI(O, O+))

Since we have CI(I, I+) ∧ A in context (4.7), from the implication of the
universally quantified predicate we may deduce (∃ O • P ∧ CI(O, O+)).

(4.9) Substituting (∀ I • CI(I, I+) ∧ A ⇒ (∃ O • P ∧ CI(O, O+))) for P+ in
(4.9), with variable renaming to avoid clashes, gives the following.

    P ⇒ (∃ O+ •
        (∀ X • CI(X, I+) ∧ A[X/I] ⇒ (∃ Y • P[X, Y/I, O] ∧ CI(Y, O+))) ∧
        CI(O, O+))

We simplify the middle line to true, assuming P and CI(O, O+).

      (∀ X • CI(X, I+) ∧ A[X/I] ⇒ (∃ Y • P[X, Y/I, O] ∧ CI(Y, O+)))
    ⇔   We use (5.3) on the quantification over Y.
      (∀ X • CI(X, I+) ∧ A[X/I] ⇒ R(V, I+, O+))
    ⇔   Reduce the scope of X
      (∃ X • CI(X, I+) ∧ A[X/I]) ⇒ R(V, I+, O+)
    ⇔   X is witnessed by I from (4.7)
      R(V, I+, O+)
    ⇔   We now use (5.3) again, from (4.7) in context
      (∃ Y • P[Y/O] ∧ CI(Y, O+))
    ⇔   Y is witnessed by O from assumptions P and CI(O, O+)
      true

With the middle line simplified, we are left with

    P ⇒ (∃ O+ • CI(O, O+))

Using (4.7) from the context, this follows from (5.2).

□

* This is a simplified version of the proof that originally appeared in [Colvin

The first assumption (5.2) of Theorem 5.1 requires that the effect of the abstract
procedure implies that its output, O, has some concrete representation. This is
typically just a type check on O, since it is the only free variable in
(∃ O+ • CI(O, O+)). One would always expect (5.2) to hold, and in general it can be
trivially discharged. The second assumption (5.3) requires that the expression
(∃ O • P ∧ CI(O, O+)), in a context including CI(I, I+) ∧ A, has some equivalent
form R that does not include a free occurrence of the abstract input I. In practice,
one does not need to explicitly discharge (5.3). The given form of the concrete
procedure (5.4) still involves the abstract type via the coupling invariant (on both
the input and output). One will need to simplify the concrete procedure to remove
the abstract data type; once the abstract input has been removed (if possible), (5.3)
has been satisfied.

Both constraints (5.2) and (5.3) can be used as a consistency check for the entire
abstract module and chosen coupling invariant, prior to calculating the concrete
module. As mentioned, condition (5.2) fails for a procedure p if p does not maintain
the abstract type for its output as expected by the coupling invariant. Condition
(5.3) fails when information in the abstract type is lost in the transformation to
the concrete type, and the abstract procedures make use of that information. For
instance, consider refining an "abstract" list module to a "concrete" set module,
where the coupling invariant is just that the set holds all the elements in the list
(thus losing information about how many times an element appears in the list,
and the order of elements in the list). We can implement procedures for adding
elements and checking membership easily; however, we would not expect to be able
to implement a count procedure, which returns the number of times an element
appears in the list. Accordingly, we will not be able to prove (5.3) for the count
procedure with the chosen coupling invariant and concrete type.
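The list-to-set example can be made concrete with two abstract lists that share one concrete representation. In this small Python sketch the names coupled and count, and the particular values, are illustrative assumptions. It shows why no predicate over the concrete set alone can determine the count.

```python
def coupled(lst, s):
    """Coupling invariant: the set holds exactly the elements of the list."""
    return set(lst) == s

def count(lst, element):
    """Abstract procedure: number of occurrences of element in the list."""
    return lst.count(element)

concrete = {1, 2}
abstract_a = [1, 2]
abstract_b = [1, 1, 2]

# Both lists are represented by the same set ...
both_coupled = coupled(abstract_a, concrete) and coupled(abstract_b, concrete)
# ... yet count gives different answers, so its result depends on the abstract input.
counts = (count(abstract_a, 1), count(abstract_b, 1))
# Membership, by contrast, is determined by the set alone.
membership_agrees = (1 in abstract_a) == (1 in abstract_b) == (1 in concrete)
print(both_coupled, counts, membership_agrees)
```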

In general, a carefully chosen coupling invariant will ensure (5.2) and (5.3) hold.
In particular, a coupling invariant in which the abstract value is some function of
the concrete, i.e., I = af(I+), will always ensure that (5.3) holds. This is because
all occurrences of the abstract input can be replaced with af(I+).

5.1.1 Simplifying the specification 

In practice, the calculated specification of a concrete procedure will be simpler than
the general form given in (5.4). From (5.3) we know that the right-hand side of the
implication can be expressed as R(V, I+, O+), which does not contain I or O free.
Making this simplification, and reducing the scope of I, gives the following simpler
specification part for p+ in (5.4).

    ⟨(∃ I • CI(I, I+) ∧ A) ⇒ R(V, I+, O+)⟩

The left-hand side of the implication matches the assumption from (5.4). Using the
assumption and the equivalent specifications law, the specification may be simplified
to just

    ⟨R(V, I+, O+)⟩

Thus, in practice, once the specification has been calculated, it is just a matter of
simplifying (∃ O • P ∧ CI(O, O+)) to eliminate references to I. Then the universal
quantification over I becomes redundant.

5.1.2 Initialisation and observer 

The following are instances of (5.4) simplified for initialisations (no opaque inputs)
and observers (no opaque outputs), respectively.

    {A}, ⟨(∃ O • P ∧ CI(O, O+))⟩                                     (5.5)
    {(∃ I • CI(I, I+) ∧ A)}, ⟨(∀ I • CI(I, I+) ∧ A ⇒ P)⟩             (5.6)

A consequence of there being no inputs for initialisations is that (5.3) can be trivially
satisfied by choosing R to be (∃ O • P ∧ CI(O, O+)). Since there are no outputs
for an observer there is no need to check (5.2).

5.2 Example 

In Sect. 4.4 we provided a proof that a concrete implementation of update from
Fig. 2 satisfies Condition 4.6. Here we use Theorem 5.1 to calculate an
implementation from the abstract update procedure and coupling invariant (4.11). We
assume that (5.2) and (5.3) hold (an example of discharging these formally will be
shown later in Sect. 5.4.1). The concrete procedure in the form of (5.4) is thus:

    {(∃ F • H = makehash(F) ∧ F ∈ pfun ∧ K ∈ σ ∧ V ∈ τ)},
    ⟨(∀ F • H = makehash(F) ∧ F ∈ pfun ∧ K ∈ σ ∧ V ∈ τ ⇒
        (∃ F' • H' = makehash(F') ∧ F' = F ⊕ {(K, V)}))⟩

From (4.12) we may use the weaken assumption law to refine the calculated
assumption.

    {H ∈ hashtable ∧ K ∈ σ ∧ V ∈ τ}

We simplify the specification by applying the one-point law to F'.

    ⟨(∀ F • H = makehash(F) ∧ F ∈ pfun ∧ K ∈ σ ∧ V ∈ τ ⇒
        H' = makehash(F ⊕ {(K, V)}))⟩

In Sect. 4.4 we showed

    makehash(F ⊕ {(K, V)}) = H ⊕ {(hash(K), V)}

Therefore we may rewrite the bottom line of the specification as

    H' = H ⊕ {(hash(K), V)}

We have eliminated references to the abstract input F on the right-hand side of
the implication. We therefore employ the simplification mentioned in Sect. 5.1.1 to
eliminate the quantification over F, resulting in the following program (identical to
that given in Sect. 4.4).

    {H ∈ hashtable ∧ K ∈ σ ∧ V ∈ τ}, ⟨H' = H ⊕ {(hash(K), V)}⟩
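The calculation can be replayed by brute force on a tiny instance, comparing the specification in the general form (5.4) with the simplified one wherever the calculated assumption holds. In the Python sketch below the table size, keys, hash values and value set are assumptions chosen to keep the enumeration small. Tuples model arrays and dicts model partial functions.

```python
from itertools import product

N = 3
HASH = {"a": 0, "b": 2}               # hypothetical injection into 0..N-1
VALUES = (1,)
NULL = None

def makehash(F):
    """Coupling invariant (4.11), returning the table as a tuple."""
    H = [NULL] * N
    for key, value in F.items():
        H[HASH[key]] = value
    return tuple(H)

# All abstract values (partial functions) and all concrete values (tables).
PFUNS = [{k: v for k, v in zip(HASH, image) if v is not NULL}
         for image in product((NULL,) + VALUES, repeat=len(HASH))]
TABLES = list(product((NULL,) + VALUES, repeat=N))

def assumption(H):
    """Calculated assumption of (5.4): some abstract F is represented by H."""
    return any(H == makehash(F) for F in PFUNS)

def calculated(K, V, H, H2):
    """Specification of (5.4): for every F coupled with H, F ⊕ {(K, V)} is coupled with H2."""
    return all(any(F2 == {**F, K: V} and H2 == makehash(F2) for F2 in PFUNS)
               for F in PFUNS if H == makehash(F))

def simplified(K, V, H, H2):
    """Simplified specification: H2 = H ⊕ {(hash(K), V)}."""
    i = HASH[K]
    return H2 == H[:i] + (V,) + H[i + 1:]

agree = all(calculated(K, V, H, H2) == simplified(K, V, H, H2)
            for K in HASH for V in VALUES
            for H in TABLES if assumption(H) for H2 in TABLES)
print(agree)
```

For a table that represents no partial function, such as one with a non-null entry at an index outside the image of hash, the universally quantified form holds vacuously, which is why the comparison is made only under the assumption.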

5.3 Specialisations 

In this section we provide some specialisations of (5.4), based on the form of the
coupling invariant and the abstract procedure. The specialisations are partitioned
based on two factors. Firstly, whether or not the abstract procedure {A}, ⟨P⟩ is
deterministic. In a deterministic procedure there is only one possible abstract output
value given any regular and input parameter values, i.e., P is of the form
O = f(V, I). In a nondeterministic procedure, there could be many possible output
values related to any given regular and input values (which we therefore
write as a ternary relation P(V, I, O)).

Secondly, we partition the specialisations based on the form of the coupling
invariant. We consider the case where the abstract variable is some (abstraction)
function of the concrete, I = af(I+). In this situation, there are potentially many
concrete representations of an abstract value, though each concrete value represents
exactly one abstract value. This is a common form of coupling invariant, and often
simplifies the data refinement process. The second form of coupling invariant we
consider is when the concrete variable is some (concretisation) function of the
abstract variable, I+ = cf(I). Thus each concrete value may represent many abstract
values, though each abstract value has exactly one concrete representation. Finally
we consider the case where the coupling invariant is a relation between the abstract
and concrete variables, CI(I, I+).

The specialisations are summarised in Fig. 4. The predicates in the cells of the
table are obtained from (5.4) by simplifying using the one-point rule. Most of the
transformations are straightforward; however, the case where the abstract procedure
is deterministic and the coupling invariant involves an abstraction function is
discussed in more detail in Sect. 6.

5.4 Example: hash table 

In this section we use the derivation process to derive an array (hashtable)
implementation of the abstract partial function type given in Fig. 2. Recall the
coupling invariant:

    H = {i: 0..N - 1 • (i, null)} ⊕ {(K, V): F • (hash(K), V)}          (4.11)

This coupling invariant is a concretisation function (the concrete variable H is a
function of the abstract variable F). There is an equivalent abstraction function
form (see (5.7) below), but we prefer to use (4.11) for the simplifications it provides
in the calculation process. When the abstract procedure is deterministic we may
use the simplification from the second row of Fig. 4. The calculated procedures can
all be implemented efficiently in the logic programming language Mercury.

                                     Abstract procedure
    Coupling       Deterministic                    Non-deterministic
    invariant      O = f(V, I)                      P(V, I, O)
    -------------------------------------------------------------------------
    I = af(I+)     af(O+) = f(V, af(I+))            P(V, af(I+), af(O+))

    I+ = cf(I)     (∀ I • I+ = cf(I) ∧ A ⇒          no simplification
                       O+ = cf(f(V, I)))

    CI(I, I+)      (∀ I • CI(I, I+) ∧ A ⇒           no simplification
                       CI(f(V, I), O+))

Fig. 4. Specialisations for concrete specification P+



5.4.1 Side conditions

Before beginning the derivation, we check that the conditions (5.2) and (5.3) hold
for each procedure in the module. Condition (5.2) requires that the coupling
invariant on the inputs, as well as A and P of the abstract procedures, imply
(∃ O+ • CI(O, O+)). Instantiating the quantification for the hash table example
gives the following condition, which trivially holds:

    (∃ H • H = {i: 0..N - 1 • (i, null)} ⊕ {(K, V): F • (hash(K), V)})

It is easily seen that each abstract procedure guarantees that its output
parameter is of type pfun, and therefore (5.2) holds for all the procedures in the
module.

To satisfy (5.3) we must be able to eliminate references to the abstract type. As
mentioned earlier, this side condition is normally satisfied implicitly in the
derivation process, since in any case we wish to eliminate the abstract variable.
However, we note that in this case there is an equivalent coupling invariant that we
could employ:

    F = (λ K: σ | H(hash(K)) ≠ null • H(hash(K)))                       (5.7)

This coupling invariant expresses the abstract variable F as a function of the
concrete variable H. The function is constructed by taking all keys K of type σ which
are not mapped to null by the hash table H (the notation '|' is used to restrict the
domain of a function); all such keys are then mapped to their (non-null) value in
the hash table. Because the relationship between the abstract and concrete
variables is one-to-one, references to the abstract input can always be eliminated by
replacing them with the right-hand side of the equality in (5.7). We may therefore
automatically discharge (5.3) for each procedure in the module.

5.4.2 Assumptions

In the partial function module, the assumptions of the procedures are that the
input and regular parameters are of the correct type. From (5.4) we calculate the
concrete assumption for the update procedure.

    (∃ F • H = makehash(F) ∧ F ∈ pfun ∧ K ∈ σ ∧ V ∈ τ)

From (4.12) we may use the weaken assumption law to refine the calculated
assumption.

    H ∈ hashtable ∧ K ∈ σ ∧ V ∈ τ

Using similar manipulation, the assumption of each concrete procedure is refined
to the corresponding abstract assumption, except with H ∈ hashtable in place of
F ∈ pfun. We now calculate the specification of each concrete procedure, and refine
the specification to code (with the exception of update+, which was dealt with in
Sect. 5.2).

5.4.3 Procedure init

Since the init procedure is a deterministic initialisation and the coupling invariant
(4.11) is also deterministic, we may immediately use the simplification in the second
row of Fig. 4 with f(V, I) = {}. Furthermore there are no inputs, eliminating the
quantification over I.

    H' = {i: 0..N - 1 • (i, null)} ⊕ {(K, V): {} • (hash(K), V)}

The rightmost set comprehension is just the empty set, and therefore the function
override has no effect.

    H' = {i: 0..N - 1 • (i, null)}

In other words, every element in the array is initialised to null.

5.4.4 Procedure remove

This is a deterministic procedure, and we use the specialisation in the second row
of Fig. 4. In this case f(V, I) is {K} ⩤ F.

    (∀ F • H = makehash(F) ∧ F ∈ pfun ∧ K ∈ σ ⇒
        H' = {i: 0..N - 1 • (i, null)} ⊕ {(X, Y): {K} ⩤ F • (hash(X), Y)})

We simplify the equality on the bottom line. Since F is a function and hash is an
injection, it is equivalent to

    H' = {i: 0..N - 1 • (i, null)} ⊕ ({hash(K)} ⩤ {(X, Y): F • (hash(X), Y)})

Therefore hash(K) must map to null in H'.

    H' = ({i: 0..N - 1 • (i, null)} ⊕
            {(X, Y): F • (hash(X), Y)}) ⊕ {(hash(K), null)}

We rewrite using the makehash function, and make the antecedent explicit again.

    (∀ F • H = makehash(F) ∧ F ∈ pfun ∧ K ∈ σ ⇒
        H' = makehash(F) ⊕ {(hash(K), null)})

From the antecedent we replace makehash(F) with H, eliminating the reference to
the abstract input F on the right-hand side of the implication. We therefore use
the simplification in Sect. 5.1.1 to eliminate the quantification over F and complete
the refinement.

    H' = H ⊕ {(hash(K), null)}

5.4.5 Procedure access 

Since access is an observer we instantiate (5.6).

    (∀ F • H = makehash(F) ∧ F ∈ pfun ∧ K ∈ σ ⇒
        K ∈ dom(F) ∧ V = F(K))

We manipulate the bottom line.

      K ∈ dom(F) ∧ V = F(K)
    ⇔   Given the assumptions and K ∈ dom(F), F(K) = H(hash(K)).
      K ∈ dom(F) ∧ V = H(hash(K))
    ⇔   We have that K ∈ dom(F) is equivalent to H(hash(K)) ≠ null using (5.7).
      H(hash(K)) ≠ null ∧ V = H(hash(K))
    ⇔   We simplify.
      V ≠ null ∧ V = H(hash(K))

As with remove+, we have eliminated the abstract input from the conclusion of the
implication, and therefore eliminate the quantification over F as in Sect. 5.1.1. As
expected, the procedure fails rather than return null for V when K is not in the
domain. The full module is given in Fig. 5.
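As a rough executable reading of the derived procedures, the Python sketch below models the concrete module with lists as arrays and runs it in lock step with an abstract partial function held in a dict, checking the abstraction function (5.7) after every step. The constants N and HASH, the value range, the helper run, and the convention that an empty answer list models failure are all assumptions made for illustration. The sketch is a sanity check only, not the Mercury implementation.

```python
import random

N = 5
HASH = {"a": 0, "b": 1, "c": 3}       # hypothetical injection into 0..N-1
NULL = None

def init():
    """Every element of the array is initialised to null."""
    return [NULL] * N

def update(K, V, H):
    """H' = H ⊕ {(hash(K), V)}."""
    return H[:HASH[K]] + [V] + H[HASH[K] + 1:]

def remove(K, H):
    """H' = H ⊕ {(hash(K), null)}."""
    return H[:HASH[K]] + [NULL] + H[HASH[K] + 1:]

def access(K, H):
    """All answers for V; the empty list models failure."""
    V = H[HASH[K]]
    return [V] if V is not NULL else []

def af(H):
    """Abstraction function (5.7): the partial function represented by H."""
    return {K: H[HASH[K]] for K in HASH if H[HASH[K]] is not NULL}

def run(seed, steps=200):
    """Apply random operations to an abstract dict and a concrete table in lock step."""
    rng = random.Random(seed)
    F, H = {}, init()
    for _ in range(steps):
        K = rng.choice(list(HASH))
        if rng.random() < 0.5:
            V = rng.randint(1, 9)
            F, H = {**F, K: V}, update(K, V, H)
        else:
            F, H = {k: v for k, v in F.items() if k != K}, remove(K, H)
        expected = [F[K]] if K in F else []
        if af(H) != F or access(K, H) != expected:
            return False
    return True

print(run(0))
```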

6 Non-determinism in module derivations 

When dealing with refinement of opaque types (in which the representation of the 
opaque type is not directly visible), there may be multiple concrete representa- 
tions of an opaque type variable which are equivalent in terms of the abstract 
specification. Thus if we choose any one of those representations, the behaviour of 
the operations on that representation will meet the requirements of the abstract 
specification. Hence for an opaque variable only one representation from a set of 
equivalent representations needs to be chosen. This corresponds to don't care or 
demonic nondeterminism. At the same time, the abstract specification of an
operation may involve don't know nondeterminism, where multiple answers, provided
via regular (non-opaque) variables, are possible. Hence in order to handle the
information hiding aspects of opaque variables within the logic programming context
we need a framework that handles both don't know and don't care nondeterminism;
not just don't know (as in standard logic programming) and not just don't care (as
in concurrent logic programming (Shapiro 1989)).

    Module Hashtable

    Type hashtable = (0..N - 1) → (τ ∪ {null})
    init:   H: hashtable_o
    update: K: σ, V: τ, H: hashtable_i, H': hashtable_o
    access: K: σ, H: hashtable_i, V: τ
    remove: K: σ, H: hashtable_i, H': hashtable_o

    init   ≙ H :- ⟨H = {i: 0..N - 1 • (i, null)}⟩
    update ≙ (K, V, H, H') :- {H ∈ hashtable ∧ K ∈ σ ∧ V ∈ τ},
                 ⟨H' = H ⊕ {(hash(K), V)}⟩
    access ≙ (K, H, V) :- {H ∈ hashtable ∧ K ∈ σ},
                 ⟨V ≠ null ∧ V = H(hash(K))⟩
    remove ≙ (K, H, H') :- {H ∈ hashtable ∧ K ∈ σ},
                 ⟨H' = H ⊕ {(hash(K), null)}⟩

    End

    Assume the constants hash and N such that hash uniquely maps elements of type σ
    to a natural number in the range 0..N - 1.

Fig. 5. Concrete partial function module

In this section we apply the basic principles of demonic nondeterminism to mod- 
ule calculation. We apply them to a particular combination of abstract procedure 
and coupling invariant, for which the calculation method presented in Sect. 5 leads
to a procedure that may produce many different answers for the concrete output 
parameters, though we "don't care" which one is chosen. This reduction in nonde- 
terminism (in the choice of concrete value) will typically lead to a more efficient 
concrete module. 



6.1 Deterministic abstract procedure and an abstraction function

Consider the specialisation in the top-left entry in Fig. 4, where we have a
deterministic abstract procedure and an abstraction function as the coupling invariant.
The calculated value of P+ will be

    af(O+) = f(V, af(I+))                                                (6.1)

For example, this specialisation can occur when representing a set S as a list L.
Assume the existence of a module providing the opaque type set and some basic
operations on sets, including a procedure, add, for adding an element to a set (such
a module can be found in Colvin et al. (2001)).

    add ≙ (E, S, S') :- ⟨S' = {E} ∪ S⟩

To represent the set as a list we choose the coupling invariant to be the abstraction
function S = ran(L), where 'ran' returns the range or set of elements in a list. Using
(6.1) we calculate the corresponding concrete procedure.

    add+ ≙ (E, L, L') :- ⟨ran(L') = {E} ∪ ran(L)⟩

This procedure outputs a list L' such that the elements of L' are the elements of
L plus E. While this is valid, there are an infinite number of such lists because an
element is not precluded from appearing multiple times in L'. Typically this will
not be a practical implementation of the add procedure for lists.

Intuitively, however, since there is exactly one abstract output, there need only
be one concrete output. In other words, the calculated value for P+ should be of
the form O+ = U, for some term U. In fact, any term U with V and I+ free that
satisfies the following condition validates O+ = U as an implementation for P+.

    af(U) = f(V, af(I+))                                                 (6.2)

This may be proved by substituting O+ = U for P+ in Condition 4.6 and
simplifying (strengthening).

In the set-as-list example, we require some value U for L' such that ran(U) =
{E} ∪ ran(L). One obvious choice is [E | L]. Clearly, ran([E | L]) = {E} ∪ ran(L).
Thus, we are free to implement the concrete version of add as ⟨L' = [E | L]⟩, which
is a stronger constraint on L' than that calculated by (6.1). To formalise the choice
for U we introduce demonic nondeterminism.
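The gap between the calculated specification (6.1) and the single answer [E | L] can be seen by enumerating, up to a length bound, every list that satisfies ran(L') = {E} ∪ ran(L). In this Python sketch the function candidates, the length bound and the sample values are assumptions for illustration.

```python
from itertools import product

def ran(xs):
    """The set of elements appearing in a list (or tuple)."""
    return set(xs)

def candidates(E, L, max_len):
    """All lists of length at most max_len satisfying ran(L') = {E} ∪ ran(L)."""
    target = {E} | ran(L)
    alphabet = sorted(target)        # any answer can only use these elements
    return [list(t)
            for n in range(max_len + 1)
            for t in product(alphabet, repeat=n)
            if ran(t) == target]

answers = candidates(1, [2], 3)       # answers of the calculated add+ up to length 3
chosen = [1] + [2]                    # the term [E | L]
print(len(answers), chosen in answers)
```

Raising the length bound only adds further answers, reflecting the observation that the calculated procedure has infinitely many, whereas the chosen term satisfies (6.2) and yields exactly one.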



6.2 Demonic nondeterminism

In Hemer et al. (2002) a demonic choice operator (⊓) and its associated semantics
and refinement laws are added to the refinement calculus. This allows the
wide-spectrum language to express the don't care interpretation of nondeterminism using
⊓, as well as the default don't know interpretation of nondeterminism, within a
single program. To understand the difference, consider the program S ⊓ T. It may
be implemented by either of the programs S or T, as embodied in the following
refinement laws:

    S ⊓ T ⊑ S
    S ⊓ T ⊑ T

Note the difference with program disjunction, where S ∨ T must be implemented
by returning the answers for both S and T. For example, consider the program

    ⟨X = 0⟩ ⊓ ⟨X = 1⟩

This program is implemented by either the program ⟨X = 0⟩ or the program
⟨X = 1⟩. In contrast, the program ⟨X = 0⟩ ∨ ⟨X = 1⟩ is not; it must return both
answers for X.

The identity of demonic choice is the program magic, that is, (magic ⊓ S) =
(S ⊓ magic) = S. It is the (unimplementable) program that refines all other
programs.

We may generalise the binary operator: given a program S with free variable X,
the demonic choice between the set of programs formed by instantiating S with
each possible value of X is given by

    ⊓ X • S(X)

This program is refined by S(U), for all terms U. We may limit the range of X by
introducing a guard. A guarded command S → T is magic if S fails, but behaves
like S, T otherwise. To restrict the range of X to just those terms that satisfy some
predicate P, we write

    ⊓ X • ⟨P(X)⟩ → S(X)

For example, a program that picks exactly one arbitrary element from a set A and
sets some variable Y to have that value is:

    ⊓ X • ⟨X ∈ A⟩ → ⟨Y = X⟩

This is in contrast to the program ⟨Y ∈ A⟩, which binds Y to every element of A.

A generalised demonic choice over ⟨P(X)⟩ → S(X) is implemented by S(U) for
all terms U that satisfy P(U). This is embodied in the following refinement law.

Law 8 (Eliminate generalised demonic choice)

                    P(U)
    ------------------------------------
    (⊓ X • ⟨P(X)⟩ → S(X)) ⊑ S(U)

We may refine a program D to a generalised demonic choice if, for all terms X
such that P(X) holds, D is refined by S(X). This is expressed by the following
refinement law.

Law 9 (Introduce generalised demonic choice)

    (∀ X • P(X) ⇒ (D ⊑ S(X)))
    ------------------------------------
    D ⊑ (⊓ X • ⟨P(X)⟩ → S(X))
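The contrast between the two kinds of nondeterminism can be illustrated with a deliberately naive Python model. It is an assumption of this sketch, and not the semantics given by Hemer et al. A program is modelled as the set of answer sets it is prepared to accept, demonic choice as the union of those sets, magic as the empty set, and refinement as set inclusion. Assumptions, termination and variable bindings are ignored.

```python
def spec(*answers):
    """A program without demonic choice: exactly one acceptable set of answers."""
    return frozenset({frozenset(answers)})

MAGIC = frozenset()                   # no acceptable behaviour at all

def disj(S, T):
    """S ∨ T: the answers of both programs must be returned."""
    return frozenset(s | t for s in S for t in T)

def choice(S, T):
    """S ⊓ T: either program may be chosen."""
    return S | T

def general_choice(values, guard, body):
    """⊓ X • ⟨guard(X)⟩ → body(X), with X ranging over the finite collection values."""
    result = MAGIC                    # a failing guard contributes magic
    for x in values:
        if guard(x):
            result = choice(result, body(x))
    return result

def refines(D, S):
    """D ⊑ S: every behaviour S may exhibit is acceptable to D."""
    return S <= D

zero, one = spec(0), spec(1)
A = {1, 2, 3}
pick = general_choice(range(10), lambda x: x in A, lambda x: spec(x))

print(refines(choice(zero, one), zero))   # demonic choice is refined by either branch
print(refines(disj(zero, one), zero))     # disjunction is not
```

In this model the elimination law corresponds to spec(U) being one of the alternatives collected by general_choice whenever the guard holds for U, and the introduction law to the fact that a union of subsets of D is itself a subset of D.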

6.3 Demonic choice in module calculation 

When there is only one abstract output value for a procedure, i.e., when it is de- 
terministic, we will typically want the corresponding concrete procedure to also 
be deterministic. In other words, when p is of the form {^},(0 = f{V,I)) for 
some assumption A and function /, the corresponding p^ should be of the form 
{^^}, {O^ = U), where A'^ is the calculated assumption and U is some term 
involving V and I^ . However, when the coupling invariant allows many concrete 
representations of an abstract value, i.e., when the coupling invariant is an abstrac- 
tion function of the form /"*" = af{I), the applicable derivation specialisation (top 
left in Fig. ^ is not deterministic for 0+. 

We solve this problem using demonic nondeterminism. Recall from Sect. 16. l1 that 
(0+ = X) is a valid implementation of p'^ for all terms X that satisfy H6.2(l . 
Expressing this formally: 

(VX . af{X)^f{V, a/(/+)) ^ {p+{V,I+, 0+) □ {A+}, {0+ = X))) 
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From this, using haw^ {introduce generalised demonic choice) we may deduce 
p+{V, /+, 0+) C {nX . {af{X) = f{V, af {!+))) {A+}, {0+ = X)) (6.3) 

This specification allows more flexibility in the final implementation of the
concrete procedure than the specification originally calculated (top left in the
corresponding figure). The implementor may choose any term $U$ such that
$af(U) = f(V, af(I^+))$, and from Law 8 (eliminate generalised demonic choice) the actual
implementation of $p^+$ becomes $O^+ = U$. Without the reduction of nondeterminism, the
implementor must retain each concrete value that corresponds to the abstract output.

In the set-as-list example, we would instantiate (6.3) to calculate the list
implementation of $add^+$.

$$\sqcap X \bullet \langle ran(X) = \{E\} \cup ran(L) \rangle \to \langle L' = X \rangle$$

To refine this to code we choose some value for $X$ that satisfies the guard. An
obvious choice is $[E \mid L]$. It may be easily seen that $ran([E \mid L]) = \{E\} \cup ran(L)$,
which is the proof obligation for applying Law 8 (eliminate generalised demonic
choice). We can therefore implement $add^+(E, L, L')$ as $\langle L' = [E \mid L] \rangle$.
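The proof obligation for the set-as-list example can be illustrated with a small Python
check, assuming Python lists stand in for the concrete list type and Python sets for the
abstract set type. The helper names (`ran`, `satisfies_guard`, `add_cons`, `add_no_dup`)
are hypothetical, and `add_no_dup` is an assumed alternative that is not taken from the
paper; it is included only to show that the guard admits more than one choice of X.

```python
def ran(xs):
    """Range of a list: the set of its elements (the abstraction function)."""
    return set(xs)


def satisfies_guard(x, e, l):
    """Guard of the demonic choice: ran(X) = {E} ∪ ran(L)."""
    return ran(x) == {e} | ran(l)


def add_cons(e, l):
    """The choice made in the text: L' = [E | L]."""
    return [e] + l


def add_no_dup(e, l):
    """A hypothetical alternative: leave L unchanged when E already occurs."""
    return l if e in l else [e] + l


# Both candidates meet the guard on a few sample inputs.
for e, l in [(1, []), (1, [2, 3]), (2, [2, 3]), (3, [3, 3, 1])]:
    assert satisfies_guard(add_cons(e, l), e, l)
    assert satisfies_guard(add_no_dup(e, l), e, l)

# A candidate that drops E does not meet the guard.
assert not satisfies_guard([2, 3], 1, [2, 3])
```

Testing on sample inputs does not discharge the proof obligation, which must hold for all
E and L; the sketch only illustrates what the obligation says.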



7 Related work 

There is a large body of work on the deductive synthesis of logic programs, a survey
of which appears in Basin et al. (2004). Deductive synthesis is a method for
deriving a logic program from a specification, similar to the refinement calculus
approach. A specification is manipulated using deduction rules (that are proved correct
within the proof framework), until an executable program is reached. The various
approaches to deductive synthesis vary mainly in their specification language;
however, most use first-order logic since this can express both specifications and logic
programming code. One of the most developed schemes for deductive synthesis is
that of Lau and Ornaghi (1997b). They introduce a specification framework, which
underlies the synthesis steps, providing axioms and derived relations.

The main difference between most deductive synthesis approaches and logic program
refinement is the inclusion of assumptions in the wide-spectrum language.
These act as preconditions, providing a context for refinement steps. Lau and
Ornaghi (1997b) have a conditional specification, which includes an input relation
for a procedure (e.g., types, modes) with respect to which the synthesis of the
procedure can take place. The refinement calculus generalises this by allowing an
assumption (input relation) for any arbitrary program fragment. A further difference
is that in deductive synthesis the deduction rules are derived with the SLD
computation rule in mind. Thus issues such as clause-ordering are dealt with during
the synthesis process. The refinement approach defers such issues to a separate
translation phase, where a particular implementation language (and computational
model) is chosen and the wide-spectrum program is translated into code for that
language. A translation scheme for Mercury programs (Somogyi et al. 1995) is
described in Colvin et al. (2002).



Despite these differences, much of the work on logic program development in the
synthesis world should be applicable in the refinement calculus. The refinement
calculus work has focused mainly on the process of developing logic programs, while
much of the synthesis work has been developing strategies for deriving programs
given particular forms of specification. We expect that such strategies can be
formulated as sequences of refinement rules.

The examples in Sect. 3 draw on work on Prolog program transformations (Sterling and
Shapiro 1994), in particular, transformations between the Prolog types list and
difference list (Marriott and Søndergaard 1988). The relationship between the list and
difference list implementations of reverse may also be defined with respect to
higher-order program synthesis, as shown by Seres and Spivey (2000).

Specifications of procedures and modules in our wide-spectrum language
are similar to Morgan's model-based module specifications for imperative programs
(Morgan 1994), though in his case the modules provide a 'hidden' state (rather than
type), which is not possible in traditional logic programs. Bancroft and Hayes (1993)
have extended the imperative calculus to include module specifications with opaque
types similar to ours. Our module specifications are similar to the module
declarations of languages such as Mercury (Somogyi et al. 1995).

There are many other existing logic programming frameworks for modules or
module-like encapsulation, e.g., (Srinivas and Jullig 1995; Lau and Ornaghi 1997a;
Lau et al. 1999). Many of these define modules through the algebraic specification
of abstract data types (ADTs) (Turski and Maibaum 1987). An implementation
module may be derived by ensuring it maintains the axioms of the ADT.
Read and Kazmierczak (1992) present a particular method of developing modular
Prolog programs from axiomatic specifications. They write their programs in a
module system based on that of extended ML. The specification of a module is written
in the form of a set of axioms stating the required properties of the procedures of
the module. To define the semantics of refinement, Prolog programs are considered
to be equivalent to their predicate completions. The definition of module refinement
in their approach is more general than the technique presented in this paper: any
implementation that satisfies the axioms is valid (cf. interpretations between
theories from logic (Turski and Maibaum 1987)). However, for modules with a large
number of procedures, presenting an axiomatic specification of how the procedures 
interrelate is more problematic than with the model-based approach used in this 
paper. This is because axioms are required to define the possible interactions be- 
tween procedures, whereas, in the approach used in this paper, each procedure is 
defined directly in terms of the model of the opaque type. In the algebraic approach, 
the proof of correctness amounts to showing that all the axioms of the specification 
hold for the implementation (Read and Kazmierczak 1992). For a module with a
large number of procedures this can be quite complex. In comparison, the approach 
presented here breaks down the problem into data refinement of each procedure in 
isolation. 

Imperative data refinement (Morgan 1994) has more similarities with our approach
to module refinement. In that framework, a specification is augmented with
the concrete variable and the coupling invariant, then refinement proceeds as normal,
until the abstract variable is removed via diminution. Neither the augment
step nor the diminish step is an actual refinement, but as in our framework the resulting
relationship between the abstract and concrete procedures is guaranteed to satisfy
the conditions for data refinement.

The calculational method for deriving a concrete module from an abstract module
and a coupling invariant in Sect. 5 is similar in style to that presented by
Morgan and Gardiner (1990). The calculated concrete procedures can appear quite
complex in both methods ((5.4) in this paper and Lemma 3 in Morgan and Gardiner (1990)).
However, in the common situation in which the coupling invariant is an abstraction
function, that is, the abstract value is a function of the concrete value, the one-point
rules can be applied to simplify the calculated procedures to term replacements on
the abstract procedure. These simplifications can occur in both settings. In either
case, the bulk of the work revolves around eliminating the existentially quantified
abstract state, and hence many data refinement techniques should be applicable
in both settings. In the terminology of Morgan and Gardiner (1990), our calculated
concrete procedure is valid, that is, it is a module refinement of the abstract.
However, it is not general (unlike the imperative calculated concrete procedure),
because there are other valid concrete procedures that are not (algorithmic)
refinements of the calculated procedure. This necessitated the introduction of demonic
nondeterminism into the calculation process in Sect. 6.



8 Conclusions 

This paper has described a cohesive framework for contextual refinement, mod- 
ule refinement, and the calculation of concrete modules. Contextual information 
simplifies the refinement process by allowing individual refinement steps and proof 
obligations to operate on the predicate level, with minimal reference to the struc- 
ture of the program. Contextual information is collected via monotonicity laws, 
which not only simplifies proofs "by hand", but can also be made transparent to
the user when using a refinement tool (Hemer et al. 2001). The contextual laws
presented in this paper have been used to develop a solution to the N-queens problem
(Colvin 2002, Chapter 4), and also in the development of a term unification
algorithm (Colvin et al. 2004). In this paper we make use of contextual information in
providing laws for module refinement and calculation in a more convenient form.

Modules are an extension of the refinement calculus that allows data abstraction 
and encapsulation. Earlier in this paper we investigated an implementation of the
module specifying a partial function type. The partial function module has also
been used in the development of a term unification algorithm (Colvin et al. 2004).
The module calculation approach in Sect. 5 can be used to automatically derive
a concrete module from an abstract module and coupling invariant. The calculated
module is guaranteed to satisfy the conditions for module refinement, thus
automatically discharging the proof obligations associated with module refinement.
However, while the calculated module is a valid module refinement, there are in
general many valid module refinements, some of which may be more efficient than
the calculated version. This can occur in the common situation where the abstract
procedure is deterministic and there are many possible concrete representations of
an abstract value. To overcome this problem, in Sect. 6 we introduced a demonic,
or don't care, nondeterministic operator into the calculation process. This approach
can be used to eliminate unwanted nondeterminism introduced by the coupling
invariant.